Area of mouth opening estimation from speech acoustics using blind deconvolution technique

Authors

  • Cong-Thanh Do
  • Abdeldjalil Aïssa-El-Bey
  • Dominique Pastor
  • André Goalic
Abstract

We propose a new method for estimating the area of mouth opening of a speaking person, as it would be seen in a video sequence, from the speech acoustics alone. In a paper published in 2000, Grant and Seitz reported different degrees of correlation between acoustic envelopes and visible movements. Our method exploits these correlations to establish a mathematical model of a Single-Input Multiple-Output (SIMO) system in which the area of mouth opening is the unknown single input to be estimated and the subband Root Mean Squared (RMS) energies of the speech signal are the observable multiple outputs. The unknown input signal can then be estimated directly with existing blind deconvolution techniques. The method requires only an audio sequence to estimate the area of mouth opening in the corresponding video sequence. Consequently, it avoids both the complex image processing techniques of conventional visual feature extraction methods and the training of estimators required by audio-to-visual mapping methods. The audio-visual sequences used for the estimation tests were recorded with an ordinary webcam. The estimation results are promising: the estimated area of mouth opening is sufficiently correlated with the manually measured one, and the average correlation coefficient obtained with the most effective configuration of the proposed method, on a set of 16 French sentences, is 0.73.
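The two measurable quantities named in the abstract, the subband RMS energies that serve as SIMO outputs and the correlation coefficient used for evaluation, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, band edges, and frame length are hypothetical, the naive DFT is chosen only to keep the example free of third-party dependencies, and the blind deconvolution step itself is not shown.

```python
import cmath
import math


def subband_rms(frame, sample_rate, bands):
    """Return the RMS energy of one frame in each (low_hz, high_hz) band."""
    n = len(frame)
    # Naive DFT over the non-negative frequencies (O(n^2), fine for a sketch).
    spectrum = [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]
    energies = []
    for low_hz, high_hz in bands:
        power = 0.0
        for k, coeff in enumerate(spectrum):
            freq = k * sample_rate / n
            if low_hz <= freq < high_hz:
                # Bins other than DC and Nyquist also account for their mirror bin.
                weight = 1.0 if k == 0 or 2 * k == n else 2.0
                power += weight * abs(coeff) ** 2
        # Parseval: the mean square of the band-limited frame is power / n^2.
        energies.append(math.sqrt(power) / n)
    return energies


def subband_envelopes(signal, sample_rate, bands, frame_len):
    """Split the signal into non-overlapping frames; one RMS vector per frame."""
    starts = range(0, len(signal) - frame_len + 1, frame_len)
    return [subband_rms(signal[i:i + frame_len], sample_rate, bands) for i in starts]


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical settings: 8 kHz audio, two illustrative bands, 64-sample frames.
SAMPLE_RATE = 8000
BANDS = [(0, 1000), (1000, 4000)]
tone = [math.sin(2 * math.pi * 500 * t / SAMPLE_RATE) for t in range(128)]
envelopes = subband_envelopes(tone, SAMPLE_RATE, BANDS, frame_len=64)
```

Each column of `envelopes` is one subband envelope, that is, one observable output channel of the assumed SIMO model. Once an estimate of the mouth-opening area is available at the same frame rate, `pearson(estimated, measured)` gives the kind of correlation coefficient the abstract reports.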

Similar resources

Comparison of deconvolution techniques using a distribution mixture parameter estimation: Application in single photon emission computed tomography imagery

Thanks to its ability to yield functionally rather than anatomically based information, the single photon emission computed tomography (SPECT) imagery technique has become a great help in the diagnosis of cerebrovascular diseases, which are the third most common cause of death in the USA and Europe. Nevertheless, SPECT images are very blurred and consequently their interpretation is difficult. ...

Classification of emotional speech using spectral pattern features

Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...

Blind separation and deconvolution for convolutive mixture of speech using SIMO-model-based ICA and multichannel inverse filtering

We propose a new two-stage blind separation and deconvolution (BSD) algorithm for a convolutive mixture of speech, in which a new Single-Input Multiple-Output (SIMO)-model-based ICA (SIMO-ICA) and blind multichannel inverse filtering are combined. SIMO-ICA can separate the mixed signals, not into monaural source signals but into SIMO-model-based signals from independent sources as they are at th...

How far are vowel formants from computed vocal tract resonances?

We compare numerically computed resonances of the human vocal tract with formants that have been extracted from speech during vowel pronunciation. The geometry of the vocal tract has been obtained by MRI from a male subject, and the corresponding speech has been recorded simultaneously. The resonances are computed by solving the Helmholtz partial differential equation with the Finite Element Me...

Blind two-thermocouple sensor characterisation

Thermocouples are one of the most popular devices for temperature measurement in many mechatronic implementations. However, large wire diameters are required to withstand harsh environments and consequently the sensor bandwidth is reduced. This paper describes a novel algorithmic compensation technique based on blind deconvolution to address this loss of high frequency signal components using t...

Publication date: 2009